    KAFnotator: a multilingual semantic text annotation tool

    At present, the availability of high-quality annotated corpora is fundamental for carrying out or evaluating several Natural Language Processing and Text Mining tasks. To create consistently annotated corpora, direct human intervention is a key factor: teams of manual taggers, usually composed of linguistically skilled people, are needed to refine existing annotations or to add new ones. As a consequence, manual corpus annotation is an expensive and highly demanding task in terms of the resources involved.

    ItaliaNLP @ TAG-it: UmBERTo for Author Profiling at TAG-it 2020

    In this paper we describe the systems we used to participate in the TAG-it task of EVALITA 2020. The first system we developed uses a linear Support Vector Machine as its learning algorithm. The other two systems are based on the pretrained Italian language model UmBERTo: one was developed following a Multi-Task Learning approach, the other a Single-Task Learning approach. These systems were evaluated on the official TAG-it test sets and ranked first in all the TAG-it subtasks, demonstrating the validity of the approaches we followed.
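
    The paper gives no implementation details for the SVM baseline, so the following is only a minimal sketch of one plausible realization, assuming a TF-IDF bag-of-words representation and scikit-learn; the toy posts and age-band labels are purely illustrative.

```python
# Hedged sketch of a linear-SVM author-profiling baseline; the feature choice
# (TF-IDF over word n-grams) is an assumption, not the paper's recipe.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data: posts paired with an age-band label.
posts = ["ciao a tutti, oggi si parla di calcio", "il nuovo album è fantastico"]
labels = ["30-39", "0-19"]

baseline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True)),
    ("svm", LinearSVC()),
])
baseline.fit(posts, labels)
print(baseline.predict(["domani c'è la partita"]))
```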

    Database Models and Data Formats

    The deliverable describes the data structures and XML formats that have been investigated and defined for representing the linguistic and semantic resources underlying the KYOTO system.
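
    As a hedged illustration of the kind of layered XML representation such a deliverable defines, the snippet below parses a simplified KAF-style document in which a terms layer points back at a word-form layer; the element and attribute names are assumptions, not the exact schema specified for KYOTO.

```python
# Parse a simplified, hypothetical KAF-style annotation: word forms in a
# <text> layer, terms in a <terms> layer that reference them by id.
import xml.etree.ElementTree as ET

SAMPLE = """\
<KAF xml:lang="en">
  <text>
    <wf wid="w1" sent="1">dogs</wf>
    <wf wid="w2" sent="1">bark</wf>
  </text>
  <terms>
    <term tid="t1" lemma="dog" pos="N"><span><target id="w1"/></span></term>
    <term tid="t2" lemma="bark" pos="V"><span><target id="w2"/></span></term>
  </terms>
</KAF>
"""

root = ET.fromstring(SAMPLE)
forms = {wf.get("wid"): wf.text for wf in root.iter("wf")}  # id -> surface form
for term in root.iter("term"):
    spanned = [forms[t.get("id")] for t in term.iter("target")]
    print(term.get("tid"), term.get("lemma"), term.get("pos"), spanned)
```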

    EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020

    Welcome to EVALITA 2020! EVALITA is the evaluation campaign of Natural Language Processing and Speech Tools for Italian. EVALITA is an initiative of the Italian Association for Computational Linguistics (AILC, http://www.ai-lc.it) and is endorsed by the Italian Association for Artificial Intelligence (AIxIA, http://www.aixia.it) and the Italian Association for Speech Sciences (AISV, http://www.aisv.it).

    Creative Commons: a user guide

    Here is an operational manual that guides creators step by step through the world of Creative Commons licenses, the best-known and most popular licenses for the free distribution of intellectual works. Without neglecting useful conceptual clarifications, the author goes into the technical details of the tools offered by Creative Commons, making them understandable even to total neophytes. This is a fundamental book for everyone interested in the open-content and copyleft world.

    Introducing CAPER, a Collaborative Platform for Open and Closed Information Acquisition, Processing and Linking

    The goal of CAPER is to create a common platform for the prevention of organised crime through the sharing, exploitation and linking of Open and Closed information sources. CAPER will support collaborative multilingual analysis of audiovisual content and biometric information, based on Visual Analytics and Text Mining technologies. CAPER will permit Law Enforcement Agencies (LEAs) to share informational, investigative and experiential knowledge. The paper will detail the CAPER platform elements:
    - Open and Closed Data Sources: TV, radio and information in closed legacy systems are the data sources to be mined and evaluated by CAPER, in addition to open Internet data sources and Semantic Web data collections.
    - Data Acquisition: depending on the information source type, different acquisition patterns will be applied to ensure the acquired information is as rich as possible and in a format suitable for analysis.
    - Information Analysis: each analysis module is geared towards a specific content type, i.e. text, image, video, audio and speech, or biometric data. These modules interact with the "Semantic mash-up" component to link Semantic Web data.
    - Information and Reference Repositories: source data and mined information will be stored in these repositories, separated by content type. The repositories will also store the reference images, text, keywords, biometric data etc. of interest to the LEAs.
    - Interoperability and Management Application: this is the end users' workbench, built on a web-based collaborative platform. It will allow LEAs to create and configure their monitoring requests and analysis petitions.
    - Visual Analytics (VA) and Data Mining (DM): VA and DM will provide the intelligence necessary to support the output of the system. They will allow LEAs to effectively mine processed data from both Closed and Open information sources, and to further relate it to Semantic Web sources when required.
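
    As a rough sketch of the per-content-type routing described in the Information Analysis element above, the code below dispatches an acquired item to the module for its content type; all module names and signatures are illustrative assumptions, not CAPER APIs.

```python
# Schematic dispatch of acquired items to per-content-type analysis modules;
# the module functions here are hypothetical placeholders, not CAPER code.
from typing import Callable, Dict

def analyze_text(payload: bytes) -> dict:
    # Placeholder for the text-mining module.
    return {"type": "text", "entities": []}

def analyze_image(payload: bytes) -> dict:
    # Placeholder for the visual-analytics module.
    return {"type": "image", "matches": []}

MODULES: Dict[str, Callable[[bytes], dict]] = {
    "text": analyze_text,
    "image": analyze_image,
}

def process(content_type: str, payload: bytes) -> dict:
    """Route one acquired item to the analysis module for its content type."""
    return MODULES[content_type](payload)

print(process("text", b"acquired bulletin ..."))
```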

    Open Source, Software libero e altre libertà

    This book presents an overview of the world of openness, from open source software to cloud services, by way of Creative Commons licenses and open data. Among the first to have made openness the main subject of his scientific and professional activity, the author offers, for the first time in the Italian landscape, a systematic picture of how to achieve the opening up and sharing of intellectual goods that would otherwise remain restricted: a phenomenon that, once marginal, has become central in many contexts. The work addresses the reader in a way that is clear but never trivial. The author tackles complex subjects in a style that is plain and understandable even to those with no background in law, while preserving the necessary scientific rigour, and provides the elements of a general theory for the creation and promotion of common intellectual goods, or commons. A revolutionary approach to this branch of law, one that turns well-known instruments such as copyright and licences on their head, required the intervention of an unconventional, nonconformist jurist like Piana.

    Extracting Events from Wikipedia as RDF Triples Linked to Widespread Semantic Web Dataset

    Many attempts have been made to extract structured data from Web resources, exposing it as RDF triples and interlinking it with other RDF datasets: in this way it is possible to create clouds of highly integrated Semantic Web data collections. In this paper we describe an approach to enhancing the extraction of semantic content from unstructured textual documents, in particular Wikipedia articles, focusing on event mining. Starting from a deep parse of a set of English Wikipedia articles, we produce a semantic annotation compliant with the Knowledge Annotation Format (KAF). We extract events from the KAF semantic annotation and then structure each event as a set of RDF triples linked to both DBpedia and WordNet. We present examples of events automatically mined from a set of Wikipedia documents, providing a general evaluation of how our approach may discover new events and link them to existing content.
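
    A minimal sketch, using rdflib, of how one mined event might be exposed as RDF triples linked to DBpedia; the event namespace and property names below are assumptions for illustration, not the vocabulary actually used by the authors.

```python
# Represent one hypothetical mined event as RDF triples linked to DBpedia.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/events/")     # hypothetical event vocabulary
DBR = Namespace("http://dbpedia.org/resource/")  # DBpedia resource namespace

g = Graph()
event = EX["e1"]
g.add((event, RDF.type, EX["Event"]))
g.add((event, RDFS.label, Literal("Battle of Hastings", lang="en")))
g.add((event, EX["involves"], DBR["William_the_Conqueror"]))
g.add((event, EX["place"], DBR["Hastings"]))

print(g.serialize(format="turtle"))
```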

    GLOSS, an infrastructure for the semantic annotation and mining of documents in the public security domain

    Efficient access to information is crucial in the work of organizations that must take decisions in emergency situations. This paper gives an outline of GLOSS, an integrated system for the analysis and retrieval of data in the environmental and public security domain. We briefly present the GLOSS infrastructure and its use, and show how semantic information of various kinds is integrated, annotated and made available to end users.